net/discovery: File persistence for `AddrCache` (#8839)
Merged
Conversation
Sajjon
commented
Jun 12, 2025
- `CodableAddrCache` (Encode/Decode) with TryFrom/From for `AddrCache`
- `SerializableAddrCache` (Serialize/Deserialize) with TryFrom/From for `AddrCache`
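The conversion pattern described above (a mirror type that uses only trivially serializable fields, converted to and from the real type with `From`/`TryFrom`) can be sketched roughly as follows. The types and the `u64` peer id below are illustrative stand-ins, not the actual polkadot-sdk definitions:

```rust
use std::collections::HashMap;

// Hypothetical richer type whose fields don't implement the serialization
// traits directly (stand-in for AddrCache over PeerId/Multiaddr).
#[derive(Debug, Clone, PartialEq)]
struct AddrCache {
    peers: HashMap<u64, Vec<String>>, // peer id -> address list
}

// Wire representation built from trivially serializable types only.
struct SerializableAddrCache {
    peers: Vec<(String, Vec<String>)>,
}

// Going to the wire type cannot fail, so plain `From` suffices.
impl From<&AddrCache> for SerializableAddrCache {
    fn from(cache: &AddrCache) -> Self {
        Self {
            peers: cache
                .peers
                .iter()
                .map(|(id, addrs)| (id.to_string(), addrs.clone()))
                .collect(),
        }
    }
}

// Coming back may fail (e.g. a malformed id on disk), hence `TryFrom`.
impl TryFrom<SerializableAddrCache> for AddrCache {
    type Error = std::num::ParseIntError;

    fn try_from(s: SerializableAddrCache) -> Result<Self, Self::Error> {
        let mut peers = HashMap::new();
        for (id, addrs) in s.peers {
            peers.insert(id.parse::<u64>()?, addrs);
        }
        Ok(Self { peers })
    }
}
```

The asymmetry (`From` one way, `TryFrom` the other) mirrors the fact that serializing an in-memory cache always succeeds, while deserializing untrusted file contents can fail.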
Force-pushed from 15fcb7e to 004ca5f.
Sajjon
commented
Jun 16, 2025
- `SerializableAddrCache` (Serialize/Deserialize) with TryFrom/From for `AddrCache`
- `AddrCache` …ode with Serialize/Deserialize
Force-pushed from 35a31a0 to a5925bf.
alexggh
reviewed
Jun 16, 2025
lexnv
reviewed
Jun 16, 2025
Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com>
lexnv
reviewed
Jun 16, 2025
All GitHub workflows were cancelled due to the failure of one of the required jobs.
Created backport PR for `stable2412`.
Please cherry-pick the changes locally and resolve any conflicts:

    git fetch origin backport-8839-to-stable2412
    git worktree add --checkout .worktree/backport-8839-to-stable2412 backport-8839-to-stable2412
    cd .worktree/backport-8839-to-stable2412
    git reset --hard HEAD^
    git cherry-pick -x ee6d22b94d9a93ac5989d4cce2f20a604b86214b
    git push --force-with-lease
Created backport PR for `stable2503`.
Please cherry-pick the changes locally and resolve any conflicts:

    git fetch origin backport-8839-to-stable2503
    git worktree add --checkout .worktree/backport-8839-to-stable2503 backport-8839-to-stable2503
    cd .worktree/backport-8839-to-stable2503
    git reset --hard HEAD^
    git cherry-pick -x ee6d22b94d9a93ac5989d4cce2f20a604b86214b
    git push --force-with-lease
paritytech-release-backport-bot bot
pushed a commit
that referenced
this pull request
Jul 2, 2025
Implementation of #8758

# Description

The Authority Discovery crate has been changed so that the `AddrCache` is persisted to `persisted_cache_file_path`, a JSON file in the `net_config_path` folder controlled by `NetworkConfiguration`. The `AddrCache` is JSON-serialized (`serde_json::to_string_pretty`) and persisted to file:

- periodically (every 10 minutes)
- on shutdown

Furthermore, the persisted `AddrCache` file is read upon start of the worker; if it does not exist, or we fail to deserialize it, a new empty cache is used. `AddrCache` is made Serialize/Deserialize thanks to `PeerId` and `Multiaddr` being made Serialize/Deserialize.

# Implementation

The worker uses a spawner in the [run loop of the worker](https://github.com/paritytech/polkadot-sdk/blob/cyon/persist_peers_cache/substrate/client/authority-discovery/src/worker.rs#L361-L372), where at an interval we try to persist the `AddrCache`. We won't persist the `AddrCache` if `persisted_cache_file_path: Option<PathBuf>` is `None`, which it is if [`NetworkConfiguration`'s `net_config_path`](https://github.com/paritytech/polkadot-sdk/blob/master/substrate/client/network/src/config.rs#L591) is `None`. We spawn a new task each time the `interval` ticks (once every 10 minutes), and it uses `fs::write` (there is also `tokio::fs::write`, but it requires the `fs` feature flag of `tokio`, which is not activated, so I chose not to use it). If the worker shuts down, we try to persist without using the `spawner`.
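As a rough illustration of the persistence flow described above (not the actual worker code): load-or-default on startup, best-effort write on each tick and on shutdown. The `Cache` alias and the toy line-based encoding below are hypothetical stand-ins for the real `AddrCache` and for `serde_json`, to keep the sketch dependency-free:

```rust
use std::{collections::BTreeMap, fs, io, path::Path};

// Hypothetical stand-in for the real `AddrCache`: authority id -> addresses.
type Cache = BTreeMap<String, Vec<String>>;

// Toy "id=addr1,addr2" line format standing in for serde_json serialization.
fn encode(cache: &Cache) -> String {
    cache
        .iter()
        .map(|(id, addrs)| format!("{id}={}", addrs.join(",")))
        .collect::<Vec<_>>()
        .join("\n")
}

// Returns None on any malformed line, mimicking a deserialization failure.
fn decode(s: &str) -> Option<Cache> {
    s.lines()
        .map(|line| {
            let (id, addrs) = line.split_once('=')?;
            Some((
                id.to_string(),
                addrs.split(',').map(str::to_string).collect(),
            ))
        })
        .collect()
}

// On worker start: read the persisted cache; fall back to an empty one if the
// file is missing or fails to decode.
fn load_or_default(path: &Path) -> Cache {
    fs::read_to_string(path)
        .ok()
        .and_then(|s| decode(&s))
        .unwrap_or_default()
}

// On each interval tick (and on shutdown): best-effort write to disk.
fn persist(path: &Path, cache: &Cache) -> io::Result<()> {
    fs::write(path, encode(cache))
}
```

The key property matching the PR's description is that every failure path on load collapses to an empty cache rather than an error, so a corrupt or missing file never prevents the worker from starting.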
# Changes

- New crate dependency: `serde_with` for the `SerializeDisplay` and `DeserializeFromStr` macros
- `WorkerConfig` in the authority-discovery crate has a new field, `persisted_cache_directory: Option<PathBuf>`
- The `Worker` constructor in the authority-discovery crate now takes a new parameter, `spawner: Arc<dyn SpawnNamed>`

## Tests

- The [authority-discovery tests](substrate/client/authority-discovery/src/tests.rs) are changed to use the tokio runtime (`#[tokio::test]`), and we pass a test worker config with a `tempdir` for `persisted_cache_directory`

# `net_config_path`

Here are the `net_config_path` values (from `NetworkConfiguration`), i.e. the folder used by this PR to save the serialized `AddrCache`:

## `dev`

```sh
cargo build --release && ./target/release/polkadot --dev
```

shows => `/var/folders/63/fs7x_3h16svftdz4g9bjk13h0000gn/T/substratey5QShJ/chains/rococo_dev/network/authority_discovery_addr_cache.json`

## `kusama`

```sh
cargo build --release && ./target/release/polkadot --chain kusama --validator
```

shows => `~/Library/Application Support/polkadot/chains/ksmcc3/network/authority_discovery_addr_cache.json`

> [!CAUTION]
> The node shut down automatically with a scary error:
> ```
> Essential task `overseer` failed. Shutting down service.
> TCP listener terminated with error error=Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
> Installed transports terminated, ignore if the node is stopping
> Litep2p backend terminated
> Error:
> 0: Other: Essential task failed.
> ```
> This is maybe expected/correct, but I just wanted to flag it; expand the log below to see the full output.
>
> Or did I break anything?
<details><summary>Full Log with scary error (expand me 👈)</summary>

The log:

```sh
$ ./target/release/polkadot --chain kusama --validator
2025-06-19 14:34:35 ----------------------------
2025-06-19 14:34:35 This chain is not in any way
2025-06-19 14:34:35 endorsed by the
2025-06-19 14:34:35 KUSAMA FOUNDATION
2025-06-19 14:34:35 ----------------------------
2025-06-19 14:34:35 Parity Polkadot
2025-06-19 14:34:35 ✌️ version 1.18.5-e6b86b54d31
2025-06-19 14:34:35 ❤️ by Parity Technologies <admin@parity.io>, 2017-2025
2025-06-19 14:34:35 📋 Chain specification: Kusama
2025-06-19 14:34:35 🏷 Node name: glamorous-game-6626
2025-06-19 14:34:35 👤 Role: AUTHORITY
2025-06-19 14:34:35 💾 Database: RocksDb at /Users/alexandercyon/Library/Application Support/polkadot/chains/ksmcc3/db/full
2025-06-19 14:34:39 Creating transaction pool txpool_type=SingleState ready=Limit { count: 8192, total_bytes: 20971520 } future=Limit { count: 819, total_bytes: 2097152 }
2025-06-19 14:34:39 🚀 Using prepare-worker binary at: "/Users/alexandercyon/Developer/Rust/polkadot-sdk/target/release/polkadot-prepare-worker"
2025-06-19 14:34:39 🚀 Using execute-worker binary at: "/Users/alexandercyon/Developer/Rust/polkadot-sdk/target/release/polkadot-execute-worker"
2025-06-19 14:34:39 Local node identity is: 12D3KooWPVh77R44wZwySBys262Jh4BSbpMFxtvQNmi1EpdcwDDW
2025-06-19 14:34:39 Running litep2p network backend
2025-06-19 14:34:40 💻 Operating system: macos
2025-06-19 14:34:40 💻 CPU architecture: aarch64
2025-06-19 14:34:40 📦 Highest known block at #1294645
2025-06-19 14:34:40 〽️ Prometheus exporter started at 127.0.0.1:9615
2025-06-19 14:34:40 Running JSON-RPC server: addr=127.0.0.1:9944,[::1]:9944
2025-06-19 14:34:40 🏁 CPU single core score: 1.35 GiBs, parallelism score: 1.44 GiBs with expected cores: 8
2025-06-19 14:34:40 🏁 Memory score: 63.75 GiBs
2025-06-19 14:34:40 🏁 Disk score (seq. writes): 2.92 GiBs
2025-06-19 14:34:40 🏁 Disk score (rand. writes): 727.56 MiBs
2025-06-19 14:34:40 CYON: 🔮 Good, path set to: /Users/alexandercyon/Library/Application Support/polkadot/chains/ksmcc3/network/authority_discovery_addr_cache.json
2025-06-19 14:34:40 🚨 Your system cannot securely run a validator. Running validation of malicious PVF code has a higher risk of compromising this machine. Secure mode is enabled only for Linux and a full secure mode is enabled only for Linux x86-64. You can ignore this error with the `--insecure-validator-i-know-what-i-do` command line argument if you understand and accept the risks of running insecurely. With this flag, security features are enabled on a best-effort basis, but not mandatory. More information: https://docs.polkadot.com/infrastructure/running-a-validator/operational-tasks/general-management/#secure-your-validator
2025-06-19 14:34:40 Successfully persisted AddrCache on disk
2025-06-19 14:34:40 subsystem exited with error subsystem="candidate-validation" err=FromOrigin { origin: "candidate-validation", source: Context("could not enable Secure Validator Mode for non-Linux; check logs") }
2025-06-19 14:34:40 Starting workers
2025-06-19 14:34:40 Starting approval distribution workers
2025-06-19 14:34:40 👶 Starting BABE Authorship worker
2025-06-19 14:34:40 Starting approval voting workers
2025-06-19 14:34:40 Starting main subsystem loop
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="candidate-validation"
2025-06-19 14:34:40 Starting with an empty approval vote DB.
2025-06-19 14:34:40 subsystem finished unexpectedly subsystem=Ok(())
2025-06-19 14:34:40 🥩 BEEFY gadget waiting for BEEFY pallet to become available...
2025-06-19 14:34:40 Received `Conclude` signal, exiting
2025-06-19 14:34:40 Conclude
2025-06-19 14:34:40 received `Conclude` signal, exiting
2025-06-19 14:34:40 received `Conclude` signal, exiting
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="availability-recovery"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="bitfield-distribution"
2025-06-19 14:34:40 Approval distribution worker 3, exiting because of shutdown
2025-06-19 14:34:40 Approval distribution worker 2, exiting because of shutdown
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="dispute-distribution"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="chain-selection"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="pvf-checker"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="availability-store"
2025-06-19 14:34:40 Approval distribution worker 1, exiting because of shutdown
2025-06-19 14:34:40 Approval distribution worker 0, exiting because of shutdown
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="approval-voting"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="approval-distribution"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="chain-api"
2025-06-19 14:34:40 Approval distribution stream finished, most likely shutting down
2025-06-19 14:34:40 Approval distribution stream finished, most likely shutting down
2025-06-19 14:34:40 Approval distribution stream finished, most likely shutting down
2025-06-19 14:34:40 Approval distribution stream finished, most likely shutting down
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="provisioner"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="availability-distribution"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="runtime-api"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="candidate-backing"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="collation-generation"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="gossip-support"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="approval-voting-parallel"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="bitfield-signing"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="collator-protocol"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="statement-distribution"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="network-bridge-tx"
2025-06-19 14:34:40 Terminating due to subsystem exit subsystem="network-bridge-rx"
2025-06-19 14:34:41 subsystem exited with error subsystem="prospective-parachains" err=FromOrigin { origin: "prospective-parachains", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
2025-06-19 14:34:41 subsystem exited with error subsystem="dispute-coordinator" err=FromOrigin { origin: "dispute-coordinator", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
2025-06-19 14:34:41 Essential task `overseer` failed. Shutting down service.
2025-06-19 14:34:41 TCP listener terminated with error error=Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2025-06-19 14:34:41 Installed transports terminated, ignore if the node is stopping
2025-06-19 14:34:41 Litep2p backend terminated
Error:
   0: Other: Essential task failed.

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
```

🤔

</details>

## `kusama -d /my/custom/path`

```sh
cargo build --release && ./target/release/polkadot --chain kusama --validator --unsafe-force-node-key-generation -d /my/custom/path
```

shows => `./my/custom/path/chains/ksmcc3/network/` for `net_config_path`

## `test`

I've configured a `WorkerConfig` with a `tempfile` for all tests.
To my surprise I had to call `fs::create_dir_all` in order for the tempdir to actually be created. --------- Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com> Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: alvicsam <alvicsam@gmail.com> (cherry picked from commit ee6d22b)
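The `fs::create_dir_all` surprise mentioned above comes from `fs::write` not creating missing parent directories. A minimal sketch of guarding against that (the function name here is made up for illustration):

```rust
use std::{fs, io, path::Path};

// Ensure the parent directory exists before persisting, since `fs::write`
// fails with NotFound if any directory in the path is missing.
fn persist_creating_dirs(path: &Path, contents: &str) -> io::Result<()> {
    if let Some(dir) = path.parent() {
        fs::create_dir_all(dir)?; // no-op if the directory already exists
    }
    fs::write(path, contents)
}
```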
Successfully created backport PR.
EgorPopelyaev
added a commit
that referenced
this pull request
Jul 4, 2025
Backport #8839 into `stable2506` from Sajjon.

See the [documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md) on how to use this bot.

<!-- # To be used by other automation, do not modify: original-pr-number: #${pull_number} -->

---------

Co-authored-by: Alexander Cyon <Sajjon@users.noreply.github.com>
Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: alvicsam <alvicsam@gmail.com>
Co-authored-by: Egor_P <egor@parity.io>
Co-authored-by: Alexander Cyon <alex.cyon@parity.io>
ordian
added a commit
that referenced
this pull request
Jul 24, 2025
* master: (91 commits)
  - Add extra information to the harmless error logs during validate_transaction (#9047)
  - `sp-tracing`: Remove `test-utils` feature (#9063)
  - add try-state check for staking roles -- staker cannot be nominator a… (#9034)
  - net/discovery: File persistence for `AddrCache` (#8839)
  - dispute-coordinator: handle race with offchain disabling (#9050)
  - Align parameters for `EventEmitter::emit_sent_event` (#9057)
  - Fetch parent block `api_version` (#9059)
  - [XCM Precompile] Rename functions and improve docs in the Solidity interface (#9023)
  - Cleanup and improvements for `ControlledValidatorIndices` (#8896)
  - reenable 0001-parachains-pvf (#9046)
  - Add optional auto-rebag within on-idle (#8684)
  - Fix flaxy 0003-block-building-warp-sync test - one more approach (#8974)
  - [Staking] [AHM] Fixes insufficient slashing of nominators (and some other small issues). (#8937)
  - chore: Bump bounded-collections dep (#9004)
  - XCMP and DMP improvements (#8860)
  - EPMB/unsigned: fixed multi-page winner computation (#8987)
  - Always send full parent header, not only hash, part of collation response (#8939)
  - revive: Precompiles should return dummy code when queried (#9001)
  - Fix confusing log messages in network protocol behaviour (#8819)
  - Fix pallet_migrations benchmark when FailedMigrationHandler emits events (#8694)
  - ...
alvicsam
added a commit
that referenced
this pull request
Oct 17, 2025